15 research outputs found

    Model-based Reinforcement Learning with Parametrized Physical Models and Optimism-Driven Exploration

    Full text link
    In this paper, we present a robotic model-based reinforcement learning method that combines ideas from model identification and model predictive control. We use a feature-based representation of the dynamics that allows the dynamics model to be fitted with a simple least squares procedure, and the features are identified from a high-level specification of the robot's morphology, consisting of the number and connectivity structure of its links. Model predictive control is then used to choose the actions under an optimistic model of the dynamics, which produces an efficient and goal-directed exploration strategy. We present real-time experimental results on standard benchmark problems involving the pendulum, cartpole, and double pendulum systems. Experiments indicate that our method is able to learn a range of benchmark tasks substantially faster than the previous best methods. To evaluate our approach on a realistic robotic control task, we also demonstrate real-time control of a simulated 7-degree-of-freedom arm. Comment: 8 pages

    Safety, Risk Awareness and Exploration in Reinforcement Learning

    No full text
    Replicating the human ability to solve complex planning problems based on minimal prior knowledge has been extensively studied in the field of reinforcement learning. Algorithms for discrete or approximate models are supported by theoretical guarantees, but the necessary assumptions are often constraining. We aim to extend these results in the direction of practical applicability to more realistic settings. Our contributions are restricted to three specific aspects of practical problems that we believe to be important when applying reinforcement learning techniques: risk awareness, safe exploration and data-efficient exploration. Risk awareness is important in planning situations where restarts are not available and performance depends on one-off returns rather than average returns. The expected return is no longer an appropriate objective because the law of large numbers does not apply. In Chapter 2 we propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties, relating it to previously proposed risk-aware objectives: minmax, exponential utility, percentile and mean minus variance. In environments with uncertain dynamics, exploration is often necessary to improve performance. Existing reinforcement learning algorithms provide theoretical exploration guarantees, but they tend to rely on the assumption that any state is eventually reachable from any other state by following a suitable policy. For most physical systems this assumption is impractical, as the systems would break before any reasonable exploration has taken place. In Chapter 3 we address the need for a safe exploration method. In Chapter 4 we address the specific challenges presented by extending model-based reinforcement learning methods from discrete to continuous dynamical systems. System representations based on explicitly enumerated states are no longer applicable. To address this challenge we use a Dirichlet process mixture of linear models to represent dynamics. The proposed model strikes a good balance between compact representation and flexibility. To address the challenge of an efficient exploration-exploitation trade-off we apply the principle of Optimism in the Face of Uncertainty that underlies numerous other provably efficient algorithms in simpler settings. Our algorithm reduces the exploration problem to a sequence of classical optimal control problems. Synthetic experiments illustrate the effectiveness of our methods.

    Entomofauna of the Linaria vulgaris Mill.

    No full text
    Many insects were collected from the plants: Chrysopa carnea (Neuroptera), Exolygus rugulipennis, Holcosthetus vernalis (Heteroptera), Gymnetron antirrhini, G. tetrum, Pseudathous rufipes, Vadonia livida (Coleoptera). The following species were collected from the flowers: Taenyothrips linariae (Thysanoptera) and Meligethes aeneus (Coleoptera). The following species were reared from the capsules: Gymnetron antirrhini, G. tetrum (Coleoptera), Cochylis posterana, C. hybridella, Eupoecilia angustana, Balseuncaria ciliella and Eupithecia linariata (Lepidoptera). The weevil Gymnetron collinum (Coleoptera) was reared from galls that developed on the roots of the plant. Some biological and ecological considerations are also given about the Gymnetron species, these being the more important factors in reducing the multiplication capacity of Linaria spp.

    Denoising archival films using a learned bayesian model

    No full text
    We develop a Bayesian model of digitized archival films and use this for denoising, or more specifically de-graining, individual frames. In contrast to previous approaches, our model uses a learned spatial prior and a unique likelihood term that models the physics that generates the image grain. The spatial prior is represented by a high-order Markov random field based on the recently proposed Field-of-Experts framework. We propose a new model of the image grain in archival films based on an inhomogeneous beta distribution in which the variance is a function of image luminance. We train this noise model for a particular film and perform de-graining using a diffusion method. Quantitative results show improved signal-to-noise ratio relative to the standard ad hoc Gaussian noise model. Index Terms: image restoration, optical film, noise

    Gastric Adenocarcinoma Associated with Acute Endocarditis of the Aortic Valve and Coronary Artery Disease in a 61-Year-Old Male with Multiple Comorbidities—Combined Surgical Management—Case Report

    No full text
    The case of a 61-year-old male with a recent total gastrectomy for a hemorrhagic gastric tumor is presented, with the important co-morbidities of type II diabetes mellitus requiring insulin, chronic hepatitis C with liver dysfunction, stage II essential hypertension, chronic stage III renal disease, peripheral type II aorto-iliac disease with stage II ischemia of both legs, and chronic anemia. About one month following the gastrectomy, the patient presented with fever and acute inflammatory syndrome. Severe aortic insufficiency, aortic valvular vegetations, and positive blood cultures with Staphylococcus saprophyticus were found. The diagnosis of infectious endocarditis on the aortic valve was established (positive blood cultures with echocardiographic features of vegetations, fever), and antibiotic treatment with Levofloxacin and Vancomycin was initiated. The evolution was favorable, with remission of the inflammatory syndrome and quick cessation of fever. However, the hemodynamic status showed progressive heart failure with acute pulmonary edema. The transesophageal echocardiographic examination confirmed the existence of severe aortic insufficiency and valvular vegetations with a left ventricular ejection fraction of 38%. The coronary angiography revealed double vessel disease. The calculated Euroscore II was 33.4%. Aortic valve replacement with porcine xenograft and double coronary artery bypass graft surgery was performed. The patient had a favorable postoperative course, remaining afebrile and out of heart failure, with the markers of inflammation largely within normal limits. The left ventricular ejection fraction increased to 50%. The successful outcome of this case, represented by a rare association of cancer, endocarditis, and coronary disease, reveals the importance of the multidisciplinary teams involved in this case: gastroenterology, general surgery, cardiology, infectious diseases, cardiac surgery, and intensive care. Therefore, in such cases with high-risk, complex patients, a strong collaboration between all specialties is needed to overcome all of the limitations of the patient’s co-morbidities.

    Mechanochemical activation of copper concentrate and the effect on oxidation of metal sulphides

    No full text
    This work presents the effect of mechanochemical activation in an attrition mill, in water medium and for different time intervals, on the particle size distribution and microstructure of copper concentrate, as well as on the oxidation of the metal sulphides after treatment in an autoclave. Results show that the mean particle size decreased almost 10 times after 30 minutes of milling and the specific surface increased from 0.1 to 4.3 m²/g. Regarding the micro-structural changes, it was found that during the mechanochemical activation the average crystallite size of chalcopyrite decreased, following an exponential trend towards a limiting value of approximately 20 nm, assuming spherical or equiaxed crystallites. The enhanced structural disorder of chalcopyrite is also highlighted by the linear increase of lattice strain with the milling time. Finally, results from the leaching experiments demonstrated that the mechanical treatment improved the oxidation of sulphides by lowering the reaction temperature and increasing the reaction rates. The above data suggest that the mechanochemical activation of copper concentrate is an efficient method to enhance the hydrometallurgical oxidation of copper concentrate and chalcopyrite in particular. Status: published